Lexical tuning based on triphone confidence estimation
Authors
Abstract
We propose and test a practical means of finding poor pronunciations and missing variants in large lexicons. We do so by statistically assessing the confidence of each phone in each pronunciation and comparing it with the statistical distribution of the same confidence metric for the corresponding phone over the entire training corpus. A phone is targeted for correction in each word in which its mean score is significantly lower than that phone's mean score over the entire training corpus. Neighboring phones are also reviewed for their contribution to the target phone's poor score. Thus far, we have experimented with this technique by manually correcting the flagged pronunciations. In experiments with Wall Street Journal and dictated physical examination corpora, word error rates were reduced commensurate with the number of dictionary entries whose pronunciations were corrected as a result of this process.
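The flagging step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which significance test is used, so the one-sided z-test, the `z_threshold` and `min_count` parameters, the function name `flag_suspect_phones`, and the `(word, phone_index, phone, score)` input layout are all assumptions made here for illustration.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, pstdev


def flag_suspect_phones(observations, z_threshold=2.0, min_count=3):
    """Flag phones whose per-word mean confidence is unusually low.

    observations: iterable of (word, phone_index, phone, score) tuples, one
    per aligned phone token in the training corpus (hypothetical layout).
    Returns (word, phone_index, phone, z) tuples, worst z-score first.
    """
    by_phone = defaultdict(list)  # every score for a phone, corpus-wide
    by_slot = defaultdict(list)   # scores for one phone slot in one word
    for word, idx, phone, score in observations:
        by_phone[phone].append(score)
        by_slot[(word, idx, phone)].append(score)

    flagged = []
    for (word, idx, phone), scores in by_slot.items():
        corpus = by_phone[phone]
        sigma = pstdev(corpus)
        # Skip slots with too few tokens or no corpus-wide spread.
        if len(scores) < min_count or sigma == 0.0:
            continue
        # Assumed test: one-sided z-test of the word-level mean against
        # the corpus-wide mean for the same phone.
        z = (mean(scores) - mean(corpus)) / (sigma / sqrt(len(scores)))
        if z < -z_threshold:
            flagged.append((word, idx, phone, z))
    return sorted(flagged, key=lambda item: item[3])
```

A flagged entry only points a reviewer at a word and phone position. Following the abstract, the phones at the neighboring positions would then be inspected as well, and the pronunciation corrected by hand.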
Related papers
A two-layer lexical tree based beam search in continuous Chinese speech recognition
In this paper, an approach to continuous speech recognition based on a two-layer lexical tree is proposed. The search network is maintained by the two-layer lexical tree, in which the first layer reflects the word net and the phone net while the second layer reflects the dynamic programming (DP). Because the acoustic information is tied in the second layer, the memory cost is so small that it has the ab...
Voice Assimilation Phenomenon and Its Implementation in LVCSR System with Lexical Tree and Bigram Language Model
In this paper, an LVCSR system implementing the Czech voice assimilation phenomenon is proposed. The recognition system uses lexical trees and a bigram language model. The first part of this article focuses on a description of the voice assimilation phenomenon, triphone lexical tree construction, and the impact of voice assimilation on LVCSR system performance. The second part outlines lexical tree ...
An EKF-based algorithm for learning statistical hidden dynamic model parameters for phonetic recognition
This paper presents a new parameter estimation algorithm based on the Extended Kalman Filter (EKF) for the recently proposed statistical coarticulatory Hidden Dynamic Model (HDM). We show how the EKF parameter estimation algorithm unifies and simplifies the estimation of both the state and parameter vectors. Experiments based on N-best rescoring demonstrate superior performance of the (contexti...
An EM-based algorithm for learning statistical hidden dynamic model parameters for phonetic recognition
This paper presents a new parameter estimation algorithm based on the Extended Kalman Filter (EKF) for the recently proposed statistical coarticulatory Hidden Dynamic Model (HDM). We show how the EKF parameter estimation algorithm unifies and simplifies the estimation of both the state and parameter vectors. Experiments based on N-best rescoring demonstrate superior performance of the (contexti...
A Voice Dictation System for a Million-Word Czech Vocabulary
The paper describes a set of techniques developed for discrete dictation within a vocabulary that contains up to a million entries, which is one of the main challenges in highly inflected languages like Czech. We present our approach to building an efficiently coded tree lexicon with suffix sub-trees and morphologic classification. Acoustic modeling is based on either monophone, diphone, or tri...